In [1]:
import pandas as pd
import numpy as np
import matplotlib as mpl
import plotly as plotly
plotly.__version__
Out[1]:
'5.9.0'

Visualization of residential building permit value among Calgary's community¶

Introduction¶

This project will explore how the residential building permit value varies in different communities within Calgary. I am focusing on three types of residential building permits: commercial/multi-family, single construction, and residential improvement, to see if the value of these permits will reflect the housing trend (large apartment complex versus single-family home versus home improvement) in each community and how the trends change before, during, and after COVID-19 outbreak from 2018-2023. Since there are over 300 small communities around Calgary, we will divide those communities into eight zones (center, north, south, east, west, northwest, northeast, and southeast).

Task summary¶

There are two datasets that I used from the Open Calgary website. My main dataset is Building Permits, and my secondary dataset is Community Crime and Disorder Statistics(2012-2019), which I use to create the community-to-sector map dictionary. I filter the Building Permits dataset into three datasets based on three types of residential building permits: New-built Commercial/Multi-Family, New-built Single-Family, and Residential improvement.

Data wrangling tasks are dropping unused columns, changing some column types, creating community-to-sector maps, adding sector columns to the three residential permit datasets, and removing rows with missing information.

Data visualization tasks are grouping the data by year and sector, calculating the yearly permit value, preplot the data for verification, and creating tables for plotting stack bar charts in Plotly.

Load data and begin wrangling process¶

I use the built-in query service on Open Data Calgary to get three datasets for each residential permit type: New-built Commercial/Multi-Family, New-built Single-Family, and Residential improvement. I also downloaded Community Crime and Disorder Statistics(2012-2019) without query filtering.

Query description:

  • Filter by: IssuedDate,PermitClassMapped, PermitType, WorkClass, HousingUnits
  • Exported as Building_Permits6625L5Multi.csv, Building_Permits131191L5Single.csv, Building_Permits179416L3Improve.csv
In [2]:
# Load Commerical/Multi-Family permit, select columns, convert float Ward Boundaries column to string Ward column. 
source_df = pd.read_csv('Building_Permits6625L5Multi.csv')
Multi_df =  source_df[['IssuedDate', 'EstProjectCost', 'HousingUnits','CommunityName','Ward Boundaries']]
Ward = Multi_df['Ward Boundaries'].astype(str)
pd.DataFrame(Ward)
Ward = Ward.rename("Ward")
Multi_df = pd.concat([Multi_df,Ward],axis =1)
Multi_df = Multi_df.drop("Ward Boundaries",axis='columns')
display(Multi_df.head())

# repeat above process with Single Familty permit. I don't need HousingUnits because this count as 1 unit.
sources_df = pd.read_csv('Building_Permits131191L5Single.csv')
Single_df =  sources_df[['IssuedDate', 'EstProjectCost','CommunityName','Ward Boundaries']]
Wards = Single_df['Ward Boundaries'].astype(str)
pd.DataFrame(Wards)
Wards = Wards.rename("Ward")
Single_df = pd.concat([Single_df,Wards],axis =1)
Single_df = Single_df.drop("Ward Boundaries",axis='columns')
display(Single_df.head())

# repeat above process with Residential Improvement permit. I don't need HousingUnits because this count as 0 unit.
sourcei_df = pd.read_csv('Building_Permits179416L3Improve.csv')
Improv_df =  sourcei_df[['IssuedDate', 'EstProjectCost','CommunityName','Ward Boundaries']]
Wardi = Improv_df['Ward Boundaries'].astype(str)
pd.DataFrame(Wardi)
Wardi = Wardi.rename("Ward")
Improv_df = pd.concat([Improv_df,Wardi],axis =1)
Improv_df = Improv_df.drop("Ward Boundaries",axis='columns')
display(Improv_df.head())
IssuedDate EstProjectCost HousingUnits CommunityName Ward
0 10/2/23 580,925 2 SAGE HILL nan
1 10/2/23 1,409,387 4 SAGE HILL nan
2 10/2/23 1,366,160 2 PARKHILL nan
3 10/2/23 1,926,194 8 MAHOGANY nan
4 10/2/23 969,508 4 MAHOGANY nan
IssuedDate EstProjectCost CommunityName Ward
0 2022/11/28 373,884 CAMBRIAN HEIGHTS nan
1 2021/11/10 NaN WEST HILLHURST 7.0
2 2022/11/09 375,787 BELMONT nan
3 2022/05/13 315,574 AMBLETON 3.0
4 2022/11/30 259,719 MAHOGANY nan
IssuedDate EstProjectCost CommunityName Ward
0 10/2/23 3,942 CARRINGTON nan
1 10/2/23 5,000 MAYLAND HEIGHTS nan
2 10/2/23 55,549 CRANSTON nan
3 10/2/23 3,000 GLENDALE nan
4 10/2/23 46,677 AMBLETON nan

Now, I check if the EstProjectCost column of each dataset has NAN or 0. If so, I will remove them. Only single-family and residential improvement permit dataset has NAN in their EstProjectCost column. Then, I checked the CommunityName column of each dataset and found no NAN, so I am good to go.

In [3]:
Multi_df['EstProjectCost'].isnull().any()
Single_df['EstProjectCost'].isnull().any()
Improv_df['EstProjectCost'].isnull().any()

Improv_df = Improv_df.dropna(subset=['EstProjectCost'])
Single_df = Single_df.dropna(subset=['EstProjectCost'])

Multi_df['CommunityName'].isnull().any()
Single_df['CommunityName'].isnull().any()
Improv_df['CommunityName'].isnull().any()
Single_df['EstProjectCost'].isnull().any()
Improv_df['EstProjectCost'].isnull().any()
Out[3]:
False

Now I have to map community name to eight sectors, which is a bit tricky than mapping ward boundaries to eight sectors, because there are only 14 wards but over 300 communities. But there are a lot of missing data in the ward coulmn and none in community name I have no choice. Luckily, I can extract the communities and sectors map from Community Crime and Disorder Statistics 2012-2019 dataset which contain most of the communites in my Building Permit dataset. I filter unique community name along with Sector from the dataset, check for missing values, and change the Community Name columns to match the column name in my main dataset so I can merge them.

In [4]:
#Update Community to Sectors Map

Crime_df = pd.read_csv('Community_Crime_and_Disorder_Statistics__2012-2019_.csv')
CrimeUnique_df = Crime_df.drop_duplicates(subset = "Community Name")
Commumap_df = CrimeUnique_df[['Community Name','Sector']] 
Commumap_df = Commumap_df.rename(columns={"Community Name": "CommunityName"})
Commumap_df.isnull().any()
display(Commumap_df.head())

# Add sector column by merging with community map using community name as merge key
MultiWCommu_df = Multi_df.merge(Commumap_df, on='CommunityName', how='left').sort_values(by='CommunityName')
display(MultiWCommu_df)
CommunityName Sector
0 THORNCLIFFE NORTH
1 WOODBINE SOUTH
2 WILLOW PARK SOUTH
4 LINCOLN PARK WEST
5 RAMSAY CENTRE
IssuedDate EstProjectCost HousingUnits CommunityName Ward Sector
3809 7/10/13 849,474 4 ALBERT PARK/RADISSON HEIGHTS 12.0 EAST
3451 4/11/14 16,175,000 100 ALBERT PARK/RADISSON HEIGHTS 12.0 EAST
2722 9/7/16 2,223,322 23 ALBERT PARK/RADISSON HEIGHTS 12.0 EAST
1556 2/21/20 875,000 4 ALBERT PARK/RADISSON HEIGHTS 12.0 EAST
1658 11/5/19 900,000 4 ALBERT PARK/RADISSON HEIGHTS 12.0 EAST
... ... ... ... ... ... ...
127 6/26/23 834,684 4 YORKVILLE nan NaN
1675 10/25/19 1,135,741 4 YORKVILLE 6.0 NaN
129 6/26/23 827,467 4 YORKVILLE nan NaN
1814 6/19/19 1,114,523 4 YORKVILLE 6.0 NaN
1848 5/16/19 1,135,741 4 YORKVILLE 6.0 NaN

6626 rows × 6 columns

Seem like the Community to sector map still missing some community Name, I have to take a look and manaually update it. Looks like I am missing 10 community. So I search for missing community update it. So now I have created a new cvs file, Commap.cvs and use it to add sector column to all three dataset.

In [5]:
# Update Community map manually and reload it

Commu2SectorMap_df = pd.read_csv('Commumap.csv')

# Merge again and test if there still NaN in sector column

MultiWCommu_df = Multi_df.merge(Commu2SectorMap_df, on='CommunityName', how='left').sort_values(by='CommunityName')
narow = MultiWCommu_df.loc[MultiWCommu_df.Sector.isnull()]
display(narow)

SingleWCommu_df = Single_df.merge(Commu2SectorMap_df, on='CommunityName', how='left').sort_values(by='CommunityName')
narows = SingleWCommu_df.loc[SingleWCommu_df.Sector.isnull()]
display(narows)

ImprovWCommu_df = Improv_df.merge(Commu2SectorMap_df, on='CommunityName', how='left').sort_values(by='CommunityName')
narowi = ImprovWCommu_df.loc[ImprovWCommu_df.Sector.isnull()]
display(narowi)
IssuedDate EstProjectCost HousingUnits CommunityName Ward Sector
1359 11/6/20 770,785 4 Undefined 3.0 NaN
1481 6/23/20 1,992,860 10 Undefined 3.0 NaN
953 12/3/21 1,064,277 4 Undefined 4.0 NaN
1345 12/2/20 9,602,466 64 Undefined 3.0 NaN
1984 12/18/18 477,806 6 Undefined 6.0 NaN
1264 3/19/21 1,542,523 10 Undefined 6.0 NaN
2491 7/4/17 20,261,732 162 Undefined 4.0 NaN
1339 12/16/20 1,101,602 6 Undefined 3.0 NaN
939 12/13/21 1,546,871 6 Undefined 4.0 NaN
1268 3/16/21 907,963 5 Undefined 11.0 NaN
1272 3/9/21 809,223 5 Undefined 11.0 NaN
1986 12/18/18 567,662 8 Undefined 6.0 NaN
1301 2/9/21 916,680 5 Undefined 11.0 NaN
1140 6/15/21 13,133,900 60 Undefined 5.0 NaN
1987 12/18/18 325,791 4 Undefined 6.0 NaN
928 12/24/21 1,044,115 4 Undefined 4.0 NaN
1170 5/31/21 10,692,300 48 Undefined 5.0 NaN
2301 3/19/18 13,107,065 83 Undefined 4.0 NaN
1160 6/4/21 762,473 4 Undefined 6.0 NaN
1371 10/27/20 1,124,995 6 Undefined 3.0 NaN
843 3/17/22 1,515,280 9 Undefined 4.0 NaN
757 5/4/22 2,833,228 12 Undefined 4.0 NaN
1370 10/30/20 770,785 4 Undefined 3.0 NaN
1206 5/10/21 1,479,205 8 Undefined 3.0 NaN
937 12/14/21 1,546,871 6 Undefined 4.0 NaN
1207 5/7/21 1,479,205 8 Undefined 3.0 NaN
838 3/25/22 1,999,630 15 Undefined 4.0 NaN
1443 8/12/20 1,134,102 7 Undefined 6.0 NaN
950 12/7/21 1,827,288 7 Undefined 4.0 NaN
1447 8/10/20 1,025,547 6 Undefined 6.0 NaN
904 1/26/22 11,742,885 47 Undefined 4.0 NaN
905 1/26/22 6,118,078 47 Undefined 4.0 NaN
1365 11/4/20 947,765 5 Undefined 3.0 NaN
800 4/22/22 3,244,257 12 Undefined 4.0 NaN
2007 12/5/18 1,500,000 4 Undefined 7.0 NaN
IssuedDate EstProjectCost CommunityName Ward Sector
4854 2023/07/18 326,677 Undefined nan NaN
3598 2023/06/20 427,996 Undefined nan NaN
2421 2023/05/31 262,511 Undefined nan NaN
529 2023/07/26 339,427 Undefined nan NaN
3021 2023/06/09 262,511 Undefined nan NaN
1953 2023/06/12 274,500 Undefined nan NaN
2069 2023/06/01 275,962 Undefined nan NaN
2775 2023/07/27 350,085 Undefined nan NaN
1654 2023/03/23 202,247 Undefined nan NaN
4950 2023/07/18 312,069 Undefined nan NaN
2458 2023/06/21 274,500 Undefined nan NaN
1223 2023/03/03 166,041 Undefined nan NaN
348 2023/07/26 339,427 Undefined nan NaN
297 2023/07/20 326,506 Undefined nan NaN
1850 2023/07/18 329,513 Undefined nan NaN
2655 2023/03/23 355,375 Undefined nan NaN
5141 2023/07/18 312,473 Undefined nan NaN
4934 2023/06/10 422,102 Undefined nan NaN
1807 2023/03/23 165,854 Undefined nan NaN
3054 2023/03/17 165,854 Undefined nan NaN
1864 2023/07/18 306,008 Undefined nan NaN
300 2023/05/31 432,076 Undefined nan NaN
3328 2023/06/02 272,168 Undefined nan NaN
4917 2023/06/20 272,168 Undefined nan NaN
2419 2023/05/31 262,511 Undefined nan NaN
2413 2023/09/28 327,038 Undefined nan NaN
5321 2023/06/29 271,790 Undefined nan NaN
3633 2023/07/17 240,560 Undefined nan NaN
3130 2023/03/17 202,419 Undefined nan NaN
2968 2023/07/11 356,430 Undefined nan NaN
2063 2023/06/15 275,530 Undefined nan NaN
5262 2023/06/29 311,721 Undefined nan NaN
IssuedDate EstProjectCost CommunityName Ward Sector
3078 6/12/23 40,505 Undefined nan NaN
2730 6/21/23 44,555 Undefined nan NaN

The Undefined and NaN community name causes the NaN in the column sector. Only the Muti-Commercial dataset has ward information for those Undefined communities. Since the Single-Family dataset has 32 entries with no community name and ward information, and the Residential improvement dataset has 14, I decided to remove those entries. However, since the Muti-Commercial dataset has ward information, I decided to replace Undefined CommunityName with their ward number and add the ward number into my Commu2SectorMap data frame so I can merge them again to get rid of NaN in the sector column.

I also convert the AppliedDate column to Pandas date-time type and the EstProjectCost column to integer.

In [6]:
#replacing the Undefined community name with Ward.
Multi_df.CommunityName =np.where(Multi_df.CommunityName == 'Undefined',
                                       Multi_df['Ward'],Multi_df.CommunityName)
# Create Ward dictinary and add it into the Community to Sector Map
WardDict = {'CommunityName':['1.0','2.0','3.0','4.0','5.0','6.0','7.0','8.0','9.0','10.0','11.0','12.0','13.0','14.0'],
            'Sector':['NORTHWEST','NORTH','NORTH','NORTH','NORTHEAST','WEST','CENTRE','CENTRE','EAST','NORTHEAST',
                     'SOUTH','SOUTHEAST','SOUTH','SOUTH']}
WardDict_df = pd.DataFrame(WardDict)
Commu2SectorMap_df = pd.concat([Commu2SectorMap_df, WardDict_df], ignore_index = True)

# Merge again to get rid of NaN in Sector column of MutiWCommu. Don't need to do this with other 2 since I have to drop it
MultiWCommu_df = Multi_df.merge(Commu2SectorMap_df, on='CommunityName', how='left').sort_values(by='CommunityName')

# Drop the row with Undefined community and missing ward information 
SingleWCommu_df = SingleWCommu_df.dropna(subset=['Sector'])

# Drop the row with Undefined community and missing ward information
ImprovWCommu_df = ImprovWCommu_df.dropna(subset=['Sector'])

# Change issue date to date-time type
MultiWCommu_df['IssuedDate'] = pd.to_datetime(MultiWCommu_df['IssuedDate']) 
SingleWCommu_df['IssuedDate'] = pd.to_datetime(SingleWCommu_df['IssuedDate'])
ImprovWCommu_df['IssuedDate'] = pd.to_datetime(ImprovWCommu_df['IssuedDate'])

# Change the EstProjectCost column from string to int. I need to remove the comma first so I can convert to int
MultiWCommu_df['EstProjectCost'] = MultiWCommu_df['EstProjectCost'].str.split(',').str.join('').astype(int)
SingleWCommu_df['EstProjectCost'] = SingleWCommu_df['EstProjectCost'].str.split(',').str.join('').astype(int)
ImprovWCommu_df['EstProjectCost'] = ImprovWCommu_df['EstProjectCost'].str.split(',').str.join('').astype(int)

display(MultiWCommu_df)
display(SingleWCommu_df)
display(ImprovWCommu_df)
IssuedDate EstProjectCost HousingUnits CommunityName Ward Sector
1268 2021-03-16 907963 5 11.0 11.0 SOUTH
1272 2021-03-09 809223 5 11.0 11.0 SOUTH
1301 2021-02-09 916680 5 11.0 11.0 SOUTH
1206 2021-05-10 1479205 8 3.0 3.0 NORTH
1371 2020-10-27 1124995 6 3.0 3.0 NORTH
... ... ... ... ... ... ...
1863 2019-05-02 1158342 4 YORKVILLE 6.0 SOUTH
1878 2019-04-11 681195 4 YORKVILLE 6.0 SOUTH
1879 2019-04-11 1125554 4 YORKVILLE 6.0 SOUTH
1848 2019-05-16 1135741 4 YORKVILLE 6.0 SOUTH
1332 2020-12-30 1461013 8 YORKVILLE 6.0 SOUTH

6626 rows × 6 columns

IssuedDate EstProjectCost CommunityName Ward Sector
80714 2011-02-25 151527 ABBEYDALE 12.0 NORTHEAST
80037 2011-07-14 441365 ACADIA 10.0 SOUTH
91126 2013-05-30 362893 ACADIA 10.0 SOUTH
106298 2016-07-26 323962 ACADIA 10.0 SOUTH
103754 2015-10-23 301952 ACADIA 10.0 SOUTH
... ... ... ... ... ...
1463 2023-09-21 353680 YORKVILLE nan SOUTH
121788 2020-11-18 193632 YORKVILLE 6.0 SOUTH
5336 2023-08-02 432043 YORKVILLE nan SOUTH
123918 2021-02-02 194845 YORKVILLE 6.0 SOUTH
125150 2021-06-28 432043 YORKVILLE 6.0 SOUTH

130736 rows × 5 columns

IssuedDate EstProjectCost CommunityName Ward Sector
91844 2012-10-18 15000 ABBEYDALE 12.0 NORTHEAST
2055 2023-07-11 2000 ABBEYDALE nan NORTHEAST
172338 2000-08-30 9570 ABBEYDALE 12.0 NORTHEAST
148432 2005-05-17 9870 ABBEYDALE 12.0 NORTHEAST
72803 2015-02-26 122500 ABBEYDALE 12.0 NORTHEAST
... ... ... ... ... ...
5811 2023-02-21 28298 YORKVILLE nan SOUTH
14502 2022-02-24 34798 YORKVILLE 6.0 SOUTH
34746 2020-01-28 23900 YORKVILLE 6.0 SOUTH
26010 2020-11-18 12858 YORKVILLE 6.0 SOUTH
24173 2021-02-12 13755 YORKVILLE 6.0 SOUTH

175126 rows × 5 columns

Visualization begins¶

Finally, my data-wrangling tasks are a success! Now, I will start to prepare my data for visualization.

In [7]:
# Group by Year so I can get yearly EstProjectCost for each sector
MultiYear = MultiWCommu_df.groupby([MultiWCommu_df.IssuedDate.dt.year,'Sector'],as_index=True)['EstProjectCost'].sum()
display(MultiYear)
SingleYear = SingleWCommu_df.groupby([SingleWCommu_df.IssuedDate.dt.year,'Sector'],as_index=True)['EstProjectCost'].sum()
display(SingleYear)
ImprovYear = ImprovWCommu_df.groupby([ImprovWCommu_df.IssuedDate.dt.year,'Sector'],as_index=True)['EstProjectCost'].sum()
display(ImprovYear)
IssuedDate  Sector   
2000        CENTRE       114337522
            EAST          16098198
            NORTH          7616974
            NORTHEAST      7357570
            NORTHWEST     25252600
                           ...    
2023        NORTHEAST    111710577
            NORTHWEST    117496901
            SOUTH        165090967
            SOUTHEAST    137599026
            WEST         318603071
Name: EstProjectCost, Length: 189, dtype: int64
IssuedDate  Sector   
2000        CENTRE        23228708
            EAST           1557527
            NORTH         91507117
            NORTHEAST     84846542
            NORTHWEST    194974840
                           ...    
2023        NORTHEAST    182845981
            NORTHWEST     94970150
            SOUTH        233156378
            SOUTHEAST    184531020
            WEST          56401500
Name: EstProjectCost, Length: 192, dtype: int64
IssuedDate  Sector   
2000        CENTRE       12796193
            EAST          1970782
            NORTH         5091377
            NORTHEAST     4867353
            NORTHWEST     7278539
                           ...   
2023        NORTHEAST    54223545
            NORTHWEST    35124127
            SOUTH        57943309
            SOUTHEAST    36279922
            WEST         26458502
Name: EstProjectCost, Length: 192, dtype: int64

I will first plot them seperately to see the overall scale.

In [8]:
import matplotlib.pyplot as plt

plt.suptitle('Muti-Commercial Residential Permit value by sector')

plt.subplot(2,3,1)
MultiYear[2023].plot.bar()
plt.title('2023')
plt.xticks(visible=False)
plt.xlabel('')
plt.ylabel('Permit value $')

plt.subplot(2,3,2)
MultiYear[2022].plot.bar()
plt.title('2022')
plt.xticks(visible=False)
plt.xlabel('')

plt.subplot(2,3,3)
MultiYear[2021].plot.bar()
plt.title('2021')
plt.xticks(visible=False)
plt.xlabel('')

plt.subplot(2,3,4)
MultiYear[2020].plot.bar()
plt.title('2020')
plt.ylabel('Permit value $')

plt.subplot(2,3,5)
MultiYear[2019].plot.bar()
plt.title('2019')

plt.subplot(2,3,6)
MultiYear[2018].plot.bar()
plt.title('2018')
plt.tight_layout()
In [9]:
plt.suptitle('Single-Family Residential Permit value by sector')

plt.subplot(2,3,1)
SingleYear[2023].plot.bar()
plt.title('2023')
plt.xticks(visible=False)
plt.xlabel('')
plt.ylabel('Permit value $')

plt.subplot(2,3,2)
SingleYear[2022].plot.bar()
plt.title('2022')
plt.xticks(visible=False)
plt.xlabel('')

plt.subplot(2,3,3)
SingleYear[2021].plot.bar()
plt.title('2021')
plt.xticks(visible=False)
plt.xlabel('')

plt.subplot(2,3,4)
SingleYear[2020].plot.bar()
plt.title('2020')
plt.ylabel('Permit value $')

plt.subplot(2,3,5)
SingleYear[2019].plot.bar()
plt.title('2019')

plt.subplot(2,3,6)
SingleYear[2018].plot.bar()
plt.title('2018')
plt.tight_layout()
In [10]:
plt.suptitle('Residential Improvement Permit value by sector')

plt.subplot(2,3,1)
ImprovYear[2023].plot.bar()
plt.title('2023')
plt.xticks(visible=False)
plt.xlabel('')
plt.ylabel('Permit value $')

plt.subplot(2,3,2)
ImprovYear[2022].plot.bar()
plt.title('2022')
plt.xticks(visible=False)
plt.xlabel('')

plt.subplot(2,3,3)
ImprovYear[2021].plot.bar()
plt.title('2021')
plt.xticks(visible=False)
plt.xlabel('')

plt.subplot(2,3,4)
ImprovYear[2020].plot.bar()
plt.title('2020')
plt.ylabel('Permit value $')

plt.subplot(2,3,5)
ImprovYear[2019].plot.bar()
plt.title('2019')

plt.subplot(2,3,6)
ImprovYear[2018].plot.bar()
plt.title('2018')
plt.tight_layout()

Now, I will create tables for each year with three columns(Sector, Residential Permit Type, Total Permit Value) and 24 rows and plot the permit value of three permit types together in one plot for each year.

I will also create line plots to show each type of permit value over time for the 8 eight sectors. To do this, I will create tables for each permit type with three columns( Year, Sector, Total Permit Value) and 48 rows.

In [11]:
S2018 = SingleYear[2018][0:8]
S2019 = SingleYear[2019][0:8]
S2020 = SingleYear[2020][0:8]
S2021 = SingleYear[2021][0:8]
S2022 = SingleYear[2022][0:8]
S2023 = SingleYear[2023][0:8]
Svec = ['Single-Family']*8

M2018 = MultiYear[2018][0:8]
M2019 = MultiYear[2019][0:8]
M2020 = MultiYear[2020][0:8]
M2021 = MultiYear[2021][0:8]
M2022 = MultiYear[2022][0:8]
M2023 = MultiYear[2023][0:8]
Mvec = ['Muti-Commercial']*8

I2018 = ImprovYear[2018][0:8]
I2019 = ImprovYear[2019][0:8]
I2020 = ImprovYear[2020][0:8]
I2021 = ImprovYear[2021][0:8]
I2022 = ImprovYear[2022][0:8]
I2023 = ImprovYear[2023][0:8]
Ivec = ['Residential Improvement']*8

ALLsector_vec = ['CENTRE','EAST','NORTH','NORTHEAST','NORTHWEST','SOUTH','SOUTHEAST','WEST']*3
ALLtype_vec = Svec + Mvec + Ivec

ALLvalue_2018 = pd.concat([S2018,M2018,I2018], axis = 0)
ALLvalue_2018list = ALLvalue_2018.tolist()
Table2018 = pd.DataFrame(list(zip(ALLsector_vec,ALLtype_vec,ALLvalue_2018list)),
                         columns =['Sector','Residential Permit Type','Total Permit Value'])

ALLvalue_2019 = pd.concat([S2019,M2019,I2019], axis = 0)
ALLvalue_2019list = ALLvalue_2019.tolist()
Table2019 = pd.DataFrame(list(zip(ALLsector_vec,ALLtype_vec,ALLvalue_2019list)),
                         columns =['Sector','Residential Permit Type','Total Permit Value'])

ALLvalue_2020 = pd.concat([S2020,M2020,I2020], axis = 0)
ALLvalue_2020list = ALLvalue_2020.tolist()
Table2020 = pd.DataFrame(list(zip(ALLsector_vec,ALLtype_vec,ALLvalue_2020list)),
                         columns =['Sector','Residential Permit Type','Total Permit Value'])

ALLvalue_2021 = pd.concat([S2021,M2021,I2021], axis = 0)
ALLvalue_2021list = ALLvalue_2021.tolist()
Table2021 = pd.DataFrame(list(zip(ALLsector_vec,ALLtype_vec,ALLvalue_2021list)),
                         columns =['Sector','Residential Permit Type','Total Permit Value'])

ALLvalue_2022 = pd.concat([S2022,M2022,I2022], axis = 0)
ALLvalue_2022list = ALLvalue_2022.tolist()
Table2022 = pd.DataFrame(list(zip(ALLsector_vec,ALLtype_vec,ALLvalue_2022list)),
                         columns =['Sector','Residential Permit Type','Total Permit Value'])

ALLvalue_2023 = pd.concat([S2023,M2023,I2023], axis = 0)
ALLvalue_2023list = ALLvalue_2023.tolist()
Table2023 = pd.DataFrame(list(zip(ALLsector_vec,ALLtype_vec,ALLvalue_2023list)),
                         columns =['Sector','Residential Permit Type','Total Permit Value'])
In [12]:
import plotly.express as px
from plotly.subplots import make_subplots

# plot the long (tidy) dataframe
fig18 = px.bar(Table2018, x="Sector", y="Total Permit Value", color="Residential Permit Type",text_auto='.4s', title="2018 Permit Value", barmode='stack')
fig18.update_layout(xaxis_title='Sector', yaxis=dict(tickformat="$",))
fig18.show()
/Users/annesunrawee/opt/anaconda3/lib/python3.9/site-packages/scipy/__init__.py:155: UserWarning: A NumPy version >=1.18.5 and <1.25.0 is required for this version of SciPy (detected version 1.26.1
  warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
In [13]:
fig19 = px.bar(Table2019, x="Sector", y="Total Permit Value", color="Residential Permit Type",text_auto='.4s', title="2019 Permit Value", barmode='stack')
fig19.update_layout(xaxis_title='Sector', yaxis=dict(tickformat="$",))
fig19.show()
In [14]:
fig20 = px.bar(Table2020, x="Sector", y="Total Permit Value", color="Residential Permit Type",text_auto='.4s', title="2020 Permit Value", barmode='stack')
fig20.update_layout(xaxis_title='Sector', yaxis=dict(tickformat="$",))
fig20.show()
In [15]:
fig21 = px.bar(Table2021, x="Sector", y="Total Permit Value", color="Residential Permit Type",text_auto='.4s', title="2021 Permit Value", barmode='stack')
fig21.update_layout(xaxis_title='Sector', yaxis=dict(tickformat="$",))
fig21.show()
In [16]:
fig22 = px.bar(Table2022, x="Sector", y="Total Permit Value", color="Residential Permit Type",text_auto='.4s', title="2022 Permit Value", barmode='stack')
fig22.update_layout(xaxis_title='Sector', yaxis=dict(tickformat="$",))
fig22.show()
In [17]:
fig23 = px.bar(Table2023, x="Sector", y="Total Permit Value", color="Residential Permit Type",text_auto='.4s', title="2023 Permit Value", barmode='stack')
fig23.update_layout(xaxis_title='Sector', yaxis=dict(tickformat="$",))
fig23.show()
In [18]:
ALLsectoryear_vec = ['CENTRE','EAST','NORTH','NORTHEAST','NORTHWEST','SOUTH','SOUTHEAST','WEST']*6
Allyear_vec = ['2018']*8 + ['2019']*8 + ['2020']*8 + ['2021']*8 + ['2022']*8 + ['2023']*8

ALLvalue_S = pd.concat([S2018,S2019,S2020,S2021,S2022,S2023], axis = 0)
ALLvalue_Slist = ALLvalue_S.tolist()
TableS = pd.DataFrame(list(zip(Allyear_vec,ALLsectoryear_vec,ALLvalue_Slist)),
                         columns =['Year','Sector','Total Permit Value'])

ALLvalue_M = pd.concat([M2018,M2019,M2020,M2021,M2022,M2023], axis = 0)
ALLvalue_Mlist = ALLvalue_M.tolist()
TableM = pd.DataFrame(list(zip(Allyear_vec,ALLsectoryear_vec,ALLvalue_Mlist)),
                         columns =['Year','Sector','Total Permit Value'])

ALLvalue_I = pd.concat([I2018,I2019,I2020,I2021,I2022,I2023], axis = 0)
ALLvalue_Ilist = ALLvalue_I.tolist()
TableI = pd.DataFrame(list(zip(Allyear_vec,ALLsectoryear_vec,ALLvalue_Ilist)),
                         columns =['Year','Sector','Total Permit Value'])
In [19]:
figS = px.line(TableS, x="Year", y="Total Permit Value", color="Sector",text ='Total Permit Value', title="Single-Family Permit Value")
figS.update_layout(xaxis_title='Year', yaxis=dict(tickformat="$",))
figS.show()
In [20]:
figM = px.line(TableM, x="Year", y="Total Permit Value", color="Sector",text ='Total Permit Value', title="Multi-Commercial Permit Value")
figM.update_layout(xaxis_title='Year', yaxis=dict(tickformat="$",))
figM.show()
In [21]:
figI = px.line(TableI, x="Year", y="Total Permit Value", color="Sector",text ='Total Permit Value', title="Residential Improvement Permit Value")
figI.update_layout(xaxis_title='Year', yaxis=dict(tickformat="$",))
figI.show()

Conclusion¶

I will summarize the findings to my guiding question: What is the relationship between residential building permit value and different communities?

  • The EAST sector has the smallest permit value of all types because it is the smallest sector. However, the WEST sector, the second smallest sector, has a much higher permit value than the EAST sector.

  • The CENTRE sector has much higher commercial permit value due to the concentration of large-scale commercial apartments and condo complexes surrounding downtown and the SAIT campus.

  • The NORTHWEST sector shares the same trend with the CENTRE center, with the commercial permit value higher than the single-family permit value, partly due to the large-scale apartment and condo complex surrounding the University of Calgary campus.

  • The NORTH and NORTHEAST sectors have led in single-family permit value since 2018, with many new communities expanding to the north, such as Ambleton, Livingston, and Glacier Ridge. SOUTH and SOUTHEAST sectors start to catch up with them in 2021, with new neighborhoods expanding to the south, such as Seton and Legacy.

  • Permit value of the new-built single-family home and new-built commercial residential buildings dominate the Calgary real estate market, with residential improvement permit value being an order of magnitude less.

  • Residential Improvement permit value increased in 2020 for all sectors except the CENTRE sector, where the permit value slightly decreased. The SOUTH sector dominates in the residential improvement permit value throughout and peaks in 2021 and 2022.

  • During the COVID-19 outbreak, 2020 saw a drop in single-family permit value in 4 sectors(NORTH, SOUTHEAST, CENTRE, WEST) while slightly increasing in the other four sectors. Commercial residential building permit value dropped in 2020 in 6 sectors, with a sharp drop in the CENTRE and NORTH sectors, while the other two sectors (SOUTHEAST and WEST) increased.

  • Both single-family and commercial residential building permit values increased in 2021 across all sectors, except for a slight drop in the commercial residential building permit value in the SOUTH sector.

  • Single-family permit value increased sharply in the NORTHEAST, NORTH, SOUTH, and SOUTHEAST in 2021. Commercial residential building permit values increase sharply in the CENTRE and NORTH sectors.

  • Single-family permit value drop in 2023 in all sectors. Commercial residential building permit values decreased in most sectors except increasing in the WEST and EAST sectors.

In [ ]: